
# **Biomarkers in Rare Genetic Diseases**

Chiara Scotton and Alessandra Ferlini

Additional information is available at the end of the chapter

http://dx.doi.org/10.5772/63354

#### **Abstract**

Biomarkers offer a way to speed up medical research by shedding light on the physiopathological mechanisms of disease. Furthermore, biomarkers are considered invaluable tools for monitoring disease progression, prognosis, and response to drugs, especially in clinical trials, where they can be used to assess the efficacy, efficiency, and side effects of novel drugs.

Biomarkers also pave the way to personalised medicine, a rapidly developing field that is of particular interest in rare diseases (RDs), i.e. those with a prevalence of less than 5/10,000, which are often genetic in origin. Although rare genetic diseases may be less appealing targets for pharmaceutical companies, they are nevertheless in urgent need of research into their diagnosis, prevention, treatment, and standards of care.

Here we summarise the state of the art in RDs, genetic diagnosis, and novel strategies aimed at accurately identifying and defining gene mutations, and review the evidence emerging from the latest research and clinical trials. We focus in particular on novel biomarkers, describing the different types discovered so far, highlighting their importance and indicating how they may be translated into research, diagnostics, treatment, and preventative applications in personalised strategies for RDs.

**Keywords:** biomarker, rare disease, genetic disease, genomics, transcriptomics, proteomics

# **1. Introduction**

As each rare disease (RD) only affects a relatively small number of individuals across the globe, there are often great obstacles to their research, diagnosis, treatment, and prevention. In Europe, a disease is considered to be rare, or orphan, when it affects fewer than 5 people in 10,000, in line with the definitions adopted by the European Commission (EC) in their Orphan Drugs Regulation N° 141/2000 and Commission Communication COM (2008) 679/2 on RDs: Europe's challenges [1]. However, RDs are often chronic, progressive, degenerative, life‐threatening and/or severely disabling in terms of a patient's quality of life, frequently leading to a lack or loss of autonomy.

© 2016 The Author(s). Licensee InTech. This chapter is distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Although infections, allergies, and environmental factors, linked in particular to degenerative and proliferative processes (e.g. auto‐immunity or cancer), may be implicated in the onset of RDs, the vast majority, approximately 80%, is caused by genetic defects (though not all RDs are genetic diseases). The signs and symptoms of many RDs may therefore be observed at birth or during childhood, and, indeed, roughly 75% of RDs affect children, including chondrodysplasia, neurofibromatosis, osteogenesis imperfecta, proximal spinal muscular atrophy and Rett syndrome. The RDs that manifest in adulthood, on the contrary, include amyotrophic lateral sclerosis (ALS), and Charcot‐Marie‐Tooth, Crohn, and Huntington diseases.

Although each individual condition may fit the definition of rare, about 7000 distinct RDs have been identified so far, affecting 6–8% of the global population. In fact, it is estimated that 350 million people worldwide suffer from even rarer conditions, suggesting that one in 20 patients will be affected by an orphan disease [2]. Therefore, collectively, RDs are not at all rare, and as a whole, they generate a considerable socioeconomic burden.
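The European threshold quoted above (fewer than 5 affected people in 10,000) lends itself to a short worked check. The following is a minimal sketch only; the function name and the example figures are illustrative assumptions and are not taken from the regulation itself.

```python
from fractions import Fraction

# European threshold quoted in the text: fewer than 5 affected people per 10,000.
EU_RARE_THRESHOLD = Fraction(5, 10_000)


def is_rare_in_europe(cases: int, population: int) -> bool:
    """Return True when the prevalence is strictly below 5 in 10,000.

    Hypothetical helper for illustration; exact fractions avoid rounding
    problems at the boundary.
    """
    if population <= 0:
        raise ValueError("population must be positive")
    return Fraction(cases, population) < EU_RARE_THRESHOLD


# Illustrative inputs: 1 case in 3,300 people is about 3 in 10,000.
print(is_rare_in_europe(1, 3_300))
# Exactly 5 in 10,000 is not "fewer than 5 in 10,000".
print(is_rare_in_europe(5, 10_000))
```

Because the comparison is strict, a condition sitting exactly on the threshold is not counted as rare in this sketch; how boundary cases are handled in practice is a regulatory matter that the code does not attempt to settle.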

In addition to there being a wide spectrum of RDs, they are also characterised by great variability in the age of onset, signs and symptoms, and patterns of tissue/organ involvement. To further complicate the issue, molecular testing and phenotype analysis reveal that mutations occurring in the same gene can be associated with different clinical diagnoses, and marked intra‐ and interfamilial phenotype variability has been documented. RDs are therefore often extremely difficult to diagnose, and only about 4000 genes have been identified for the 7000 RDs described in the OMIM database [3]. Understandably, therefore, the IRDiRC [4] has set its members the challenge of diagnosing most, if not all, RDs by 2020, and discovering at least 200 new therapeutic options for their patients.

Nevertheless, without early diagnosis and effective treatment strategies, it is impossible to guarantee any improvement in the quality of life and/or life expectancy of such patients. Furthermore, our lack of knowledge regarding the causes, physiopathological mechanisms, and clinical progression of RDs makes it difficult to apply available treatments and to develop novel therapeutic strategies. In addition, the small number of patients complicates the recruitment of an adequate sample for clinical trials, especially in children, who make up an even smaller percentage of the overall RD population. This is an obvious deterrent to the pharmaceutical industry, which has only limited interest in developing and marketing products for this small consumer base. In order to counter some of these problems, both national governments and the EU have made orphan drug laws and funding a priority, but, despite this recent interest, treatment options are currently only available for 5% of RDs [2].

It is not only RDs that could benefit from more activity in this area, as RD research is also considered pivotal for many common diseases, and has in some cases revealed mechanisms and pathways that have been subsequently associated with other rare or common diseases [5]. Indeed, several RDs have been linked to a high degree of genetic and phenotypic heterogeneity; for example, mutations occurring in the LMNA gene can cause different disease types by affecting different tissues, such as (i) striated muscle (muscular dystrophy such as Emery‐Dreifuss muscular dystrophy and limb‐girdle muscular dystrophy or dilated cardiomyopathy), (ii) adipose tissue (lipodystrophy syndromes), (iii) peripheral nerve (peripheral neuropathy such as Charcot‐Marie‐Tooth disorder) or (iv) accelerated ageing (progeria diseases). There are also clinical signs that can be associated with both genetic and acquired disease. For instance, renal cell carcinoma is characterised by the dysregulation of metabolic pathways (oxygen, iron, and nutrient sensing) which are also manifestations of rare hereditary syndromes such as Von Hippel‐Lindau (VHL, OMIM 193300) and Birt‐Hogg‐Dubé (BHD, OMIM 135150) syndromes, as well as hereditary leiomyomatosis and renal cell carcinoma (HLRCC, OMIM 150800) [5]. It is therefore essential for the research being carried out worldwide to focus on identifying characteristic determinants able to discriminate between specific disease states, stages, and probabilities of responding to particular treatments: put simply, biomarkers.

# **2. Biomarker: definition and utility**

Biomarkers were first described and defined in 2001 by two different review papers [6, 7], both of which suggested that they would be the key to understanding the physiopathology of disease and discovering novel treatment strategies. The classic definition of a biomarker is 'a characteristic that is objectively measured and evaluated as an indicator of normal biological processes, pathogenic processes, or pharmacological responses to a therapeutic intervention'. In other words, biomarkers are 'measurables' that rely on tools and technologies for assessing body fluids or tissue (blood, urine, cell, skin, etc.), such as DNA analysis [point variants, copy number variation (CNV), translocations, methylation analysis], RNA analysis [expression profile and microRNA (miRNA) characterisation], protein analysis (quantification of circulating proteins), and imaging technologies, or other means of physiological measurement [8].

**Figure 1.** Literature survey. Number of citations in PubMed in which the keyword 'biomarker' is present in 'title' and/or 'abstract' from 2001 to 2015.

Since their definition, the number of published papers related to biomarker discovery has increased more than 20‐fold (**Figure 1**), and the discovery and development of novel biomarkers have kept pace with technical advances, in particular the advent of high‐throughput analysis technologies. Moreover, the large number of grant projects set up over the last 5 years to fund biomarker research, including BIO‐NMD [9] and NeurOmics [10], has begun to bear considerable fruit in this field.

The BIO‐NMD project is a Europe‐wide research network whose aim is to identify and validate biomarkers for rare neuromuscular diseases, such as dystrophinopathies (Becker muscular dystrophy, OMIM 300376; Duchenne muscular dystrophy, OMIM 310200; dilated cardiomyopathy, OMIM 302045) and COL6‐related myopathies (Bethlem myopathy, OMIM 158810; Ullrich congenital muscular dystrophy, OMIM 254090). Funded by the EU (2009–2012), BIO‐NMD set out to investigate different human tissues/cells/fluids using multiple ‐omic strategies (genomics, transcriptomics, and proteomics), an approach that led to the identification of several biomarkers. Thanks to this project, both plasma and tissue biomarkers that will be useful for monitoring disease progression, prognosis, and treatment response have been described and will ultimately help to pinpoint appropriate options for personalised treatment [11, 12].

In a similar vein, the EU's NeurOmics project is still ongoing and aims to revolutionise diagnostics and develop new treatments for 10 major neuromuscular and neurodegenerative diseases by using sophisticated ‐omics technologies. To do this, it has brought together leading European research groups, five highly innovative SMEs, and experts from outside the EU, who are all working to identify genes and develop biomarkers for clinical application, as well as to identify drug targets and improve understanding of the physiopathology of the diseases in question.

This research activity has been largely prompted by the versatility of biomarkers. Indeed, the classical view of biomarkers as a clinical end‐point, an objective snapshot that reflects how a patient feels, functions or survives, is extremely reductive. In addition to numerous applications in clinical settings, biomarkers may also serve as a surrogate end‐point, a predictor of clinical benefit (or lack thereof) based on epidemiological, therapeutic, physiopathological, or other scientific evidence [13]. In other words, a biomarker may act as a clinically meaningful end‐point in clinical trials. Such surrogate end‐point biomarkers are foreseeably of particular benefit in RDs, in which a high percentage of diseases without a genetic cause, slow disease progression, chronic nature of the diseases, high heterogeneity of signs and symptoms within the same phenotype, and the difficulty in objectively measuring any change in symptoms dramatically increase the expense of clinical trials. Not only the cost but also the difficulty in undertaking trials based on conventional end‐points severely curtails their number, and the lack of sensitive, specific, and timely outcome measures hinders the discovery and development of novel treatments.

However, a biomarker can lessen the burden of the clinical trial process by providing information about the safety and efficacy of treatments before the collection of definitive clinical data, which provides the opportunity for mid‐course re‐appraisal, and even interruption if the intervention being investigated is revealed as potentially harmful to participants [14]. Indeed, biomarkers are far superior to subjective measurements, which may not be directly associated with a disease characteristic, or able to detect small changes, especially in the short term. Biomarkers, on the other hand, can provide an objective measurement of aspects precisely correlated to a specific disease condition, potentially enabling small changes in status to be identified, the disease progression to be assessed, and the likely effects of the therapeutic intervention to be predicted [1] while the trial is ongoing, as well as in real‐world settings.

Considering the versatility of biomarkers, the European Medicines Agency (EMA) has attempted to standardise them by drawing up a list of the features of an 'ideal' biomarker, namely [15]:


**•** Time and cost‐effectiveness. Like a moneybox, a biomarker should be quick and easy to use and not be so expensive and time‐consuming to measure that it cannot be used as a surrogate endpoint in clinical trials or to aid diagnostics and disease monitoring.

Biomarkers that possess all these features will inevitably lead to improvements in clinical trials, especially in the field of personalised medicine. Personalised medicine shifts the current 'one‐size‐fits‐all' approach to a more individual line of attack or defence, centred on giving 'the right drug to the right patients at the right time' [16]. This is particularly crucial in RDs, in which successful treatment development is generally hindered by the small number of patients and short runs that characterise trials for novel interventions.

# **3. Strategies for biomarker discovery**

In recent years, novel techniques and strategies have emerged for biomarker discovery, and there are currently two major approaches being applied:


#### **3.1. Discovery of genetic variations**

Next‐generation sequencing (NGS) techniques are based on high‐throughput genomic and transcriptomic sequencing. In brief, target regions can be isolated from the entire genome by hybridisation to complementary sequences. This 'capturing' is performed on demand, to isolate sequences that may consist of protein‐coding regions only (whole exome sequencing), a specifically targeted gene region (focusing on a limited number of known genes), or the entire genome (whole‐genome sequencing). The captured region can then be sequenced by one of several methods (pyrosequencing, 454 Roche; sequencing by reversible termination, Illumina; sequencing by ligation, SOLiD; semiconductor sequencing, Ion Torrent), and the resulting output is composed of several sequence reads, which are then computationally aligned to the known genome in order to unravel any variations, such as small insertions or deletions [17]. Unlike traditional Sanger sequencing, which reads a sequence base by base, NGS is very time‐efficient, enabling the simultaneous analysis of millions of base pairs organised in multiple aligned reads. Despite its efficiency, however, NGS is unable to detect dynamic mutations (e.g. triplet expansions) and still has limited capability to identify CNVs. Nevertheless, while we await the development of specific algorithms to overcome these limitations, NGS can be integrated with DNA profiling tools, such as array‐CGH, for the detection of CNVs and other genetic imbalances.
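The step in which aligned reads are compared with the known genome can be illustrated with a toy example. The sketch below is a deliberate simplification and not a production variant caller: it assumes that reads are already aligned without gaps, it reports single‐base substitutions only (not the insertions or deletions mentioned above), and the function name and the thresholds `min_depth` and `min_fraction` are hypothetical choices made for illustration.

```python
from collections import Counter


def call_mismatches(reference, aligned_reads, min_depth=2, min_fraction=0.8):
    """Report positions where the aligned reads disagree with the reference.

    aligned_reads is a list of (start, sequence) pairs, where start is the
    0-based reference position of the first base of an ungapped read.
    Returns a list of (position, reference_base, observed_base, depth).
    """
    # Build a per-position tally of the bases observed in the reads.
    pileup = [Counter() for _ in reference]
    for start, sequence in aligned_reads:
        for offset, base in enumerate(sequence):
            position = start + offset
            if 0 <= position < len(reference):
                pileup[position][base] += 1

    variants = []
    for position, counts in enumerate(pileup):
        depth = sum(counts.values())
        if depth < min_depth:
            continue  # too few reads to say anything at this position
        base, support = counts.most_common(1)[0]
        if base != reference[position] and support / depth >= min_fraction:
            variants.append((position, reference[position], base, depth))
    return variants


# Invented sequences: three overlapping reads all carry an A where the
# reference has a T at position 3.
reference = "ACGTACGT"
reads = [(0, "ACGA"), (2, "GAAC"), (3, "AACGT")]
print(call_mismatches(reference, reads))  # [(3, 'T', 'A', 3)]
```

Requiring a minimum depth and a minimum supporting fraction is one simple way of separating a genuine variant from an isolated sequencing error; real pipelines also weigh base and mapping quality, which this sketch ignores.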

The methylation profile of genes can also be explored via epigenomics. In fact, the recent advent of methylomic profiling now allows us to determine the DNA methylation status of the entire genome, and thereby to identify an increasing number of genes that are methylated in disease states, particularly cancer [18].

#### **3.2. Discovery of RNA variations**

Complementary genome‐wide information technologies can be used to identify qualitative and quantitative variations at the RNA level. For example, a gene expression microarray or high‐throughput technology such as RNA sequencing (RNAseq) can be used to perform transcriptome analysis. Transcriptome profiling can be performed on samples from biopsy or cell cultures from specific affected tissues, or, less invasively, from different body fluids such as urine, blood, or saliva [1]. The technique enables the generation of enriched RNA/cDNA libraries that cover the entire transcribed region, or, alternatively, a catalogue of genes of interest that can be used to evaluate gene expression or identify novel transcripts, alternative splicing, and/or gene fusion products.

Although transcript sequencing is heavily influenced by the tissue/cell type analysed, transcription and RNA editing being profoundly tissue specific, it is highly versatile. Indeed, in addition to mRNAs, transcriptomics can be extended to non‐coding RNAs such as miRNAs, single‐stranded sequences of 18–25 nucleotides that regulate the expression of target genes and are already known for their role as biomarkers.

Gene expression profiling is also considered a very powerful method of identifying biomarkers of pathological status, disease progression, and/or drug response, with the advantage of exploring specific tissue behaviour [19]. Microarray technologies may be used to quantify and compare the expression levels of many transcripts in diseased and healthy samples, or at different time points (e.g. pre‐ and post‐treatment).
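The comparison of transcript levels between diseased and healthy samples is commonly summarised as a log2 fold change. The sketch below shows one simple way of ranking candidate transcripts on that basis; the gene labels, the expression values, the pseudocount, and the cut‐off `min_abs_lfc` are invented for illustration, and a real analysis would add a statistical test with correction for multiple comparisons.

```python
import math
from statistics import mean


def log2_fold_change(diseased, healthy, pseudocount=1.0):
    """log2 ratio of mean expression (diseased / healthy).

    The pseudocount avoids division by zero for unexpressed transcripts.
    """
    return math.log2((mean(diseased) + pseudocount) / (mean(healthy) + pseudocount))


def rank_candidates(expression, min_abs_lfc=1.0):
    """Return (gene, log2 fold change) pairs passing the cut-off,
    sorted by decreasing absolute fold change.

    expression maps a gene label to (diseased_values, healthy_values).
    """
    scored = [(gene, log2_fold_change(d, h)) for gene, (d, h) in expression.items()]
    hits = [(gene, lfc) for gene, lfc in scored if abs(lfc) >= min_abs_lfc]
    return sorted(hits, key=lambda item: abs(item[1]), reverse=True)


# Invented expression values for three hypothetical transcripts.
expression = {
    "GENE_A": ([7, 7, 7], [1, 1, 1]),  # up-regulated in disease
    "GENE_B": ([3, 3], [3, 3]),        # unchanged
    "GENE_C": ([0, 0], [7, 7]),        # down-regulated in disease
}
print(rank_candidates(expression))  # [('GENE_C', -3.0), ('GENE_A', 2.0)]
```

The same function applies unchanged to pre‐ and post‐treatment samples by passing the two time points in place of the diseased and healthy groups.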

#### **3.3. Discovery of protein biomarkers**

The evolution of mass spectrometry (MS)‐based technologies and the development of other proteomic strategies such as two‐dimensional difference gel electrophoresis (2D‐DIGE) have considerably advanced our understanding of the nature of the proteome. This can be analysed to explore specific cellular functions and the control of specific biological processes, although the complexity and size of the human proteome pose larger challenges than those encountered in genomic and transcriptomic research [20]. Indeed, the individual proteome can change markedly over the course of a lifetime, and a single gene often produces very different isoforms, by alternative splicing or post‐translational modifications such as phosphorylation, glycosylation, acetylation, and ubiquitination. However, proteins are often a target for pharmacological intervention, and proteomic technologies able to evaluate the expression level of soluble proteins are emerging, thereby paving the way to the discovery and validation of protein biomarkers.

The most common novel high‐throughput approaches currently being used in discovery proteomics are those based on MS. These technologies enable the analysis of complex mixtures of proteins, measuring the mass‐to‐charge ratio of charged particles in order to determine their mass, quantity, and elemental composition. There are essentially two different types of MS approaches, namely top‐down experiments, which analyse the whole protein, and bottom‐up, which analyse proteins previously digested by proteases. For characterisation purposes, the resulting peptide mixtures may then be separated using different strategies, such as liquid chromatography (LC), gas chromatography, or ion mobility spectrometry, and then the identified proteins can be quantified. To achieve this, samples can be isotopically labelled by different methods, such as stable isotope labelling by amino acids in cell culture (SILAC), isotope‐coded affinity tagging (ICAT), isobaric tags for relative and absolute quantification (iTRAQ), and mass tags for relative and absolute quantification (mTRAQ) [21]. A typical MS protocol would therefore consist of sample loading (of intact or digested protein), vaporisation, ionisation, and separation of the ionised sample by mass‐to‐charge ratio, detection in an MS instrument, and generation of a detailed profile of the exact chemical composition of a sample.
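The relationship between the measured mass‐to‐charge ratio and the mass of the analyte can be made concrete with a short calculation. The sketch below assumes positive‐mode ionisation in which each charge is carried by one added proton, so that a species of neutral mass M and charge z is observed at m/z = (M + z × 1.007276) / z; the function names and the example mass are illustrative, and other adducts or negative‐mode ionisation would need a different mass term.

```python
PROTON_MASS_DA = 1.007276  # approximate mass of a proton, in daltons


def mz_from_mass(neutral_mass, charge):
    """m/z observed for a protonated species [M + zH]^z+ (assumed ion type)."""
    if charge <= 0:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass + charge * PROTON_MASS_DA) / charge


def mass_from_mz(mz, charge):
    """Neutral mass recovered from an observed m/z and an assigned charge."""
    if charge <= 0:
        raise ValueError("charge must be a positive integer")
    return charge * (mz - PROTON_MASS_DA)


# Invented example: a peptide of neutral mass 1000 Da carrying two protons.
mz = mz_from_mass(1000.0, 2)
print(round(mz, 6))                    # 501.007276
print(round(mass_from_mz(mz, 2), 6))   # 1000.0
```

The same neutral mass therefore appears at a different m/z for each charge state, which is why the charge has to be assigned before a mass can be reported.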

By these means, it is possible to differentially analyse proteins from different biological processes or disease states in order to discover candidate biomarkers. Many biomarkers used in existing clinical practice are assays to quantify proteins, and proteomics techniques such as 2D‐DIGE can be used to separate non‐digested proteins within a biological sample based upon either apparent molecular mass (by gel electrophoresis) or charge (via isoelectric focusing). Such strategies thereby provide a measure of protein abundance and enable the identification of isoforms and post‐translational modifications [21]. Validation of such potential biomarkers can be performed using a common protein expression method such as Western blotting and/or antibody‐based assays.

As shown in the workflow illustrated in **Figure 2**, biomarker discovery can be facilitated by using a strategy combining two or more of the above approaches, for example


The benefit of multiple ‐omics approaches has been clearly demonstrated by Finkel et al., in their recent 'BforSMA' cross‐sectional study aimed at identifying novel biomarkers in spinal muscular atrophy (SMA, OMIM: 253300). SMA is a neurodegenerative motor neuron disorder caused by homozygous/compound heterozygous mutations in the survival motor neuron 1 (SMN1) gene [22]. It is characterised by the degeneration of the anterior horn cells of the spinal cord and leads to symmetrical muscle weakness and atrophy. The SMN protein plays a crucial role in RNA biosynthesis in all tissues, forming a large, multiprotein complex that drives the assembly of small nuclear ribonucleoproteins (snRNPs) of the spliceosomes. Through functions in RNP assembly, the SMN complex is required for the expression of essentially all protein‐coding genes [23]. Preliminary results from the 'BforSMA' project, which was based on proteomics, metabolomics, and transcriptomics discovery platforms, indicate the discovery of a total of 200 candidate biomarkers, including 97 plasma proteins, 59 plasma metabolites, and 44 urine metabolites that could potentially be used to address clinical trial design and identify novel therapeutic targets in SMA [22].

**Figure 2.** Flowchart of biomarker discovery. Different technologies from genomic, transcriptomic, and proteomic levels are able to detect potential biomarkers; subsequently the connection between the three approaches and validation may confirm the biomarker identification.

# **4. Molecular biomarkers**

#### **4.1. Genomic biomarkers**

New molecular biomarkers could be detected at different levels. According to the Food and Drug Administration/EMA definition, genomic biomarkers include both DNA and RNA determinants. They therefore comprise DNA methylation status and sequence variations, such as single‐nucleotide polymorphisms (SNPs), insertions, deletions, translocations, and CNV, as well as RNA alterations such as differential gene expression and miRNAs (**Figure 3**). The current research focus has shifted somewhat, from SNP to haplotype analysis, which it is hoped will furnish useful disease, prognostic, or predictive biomarkers.

**Figure 3.** Schematic representation of the main features (applications, troubles, and advantages) of biomarker types, in terms of stability, specificity, repeatability, and accessibility.

Indeed, DMD patients, for example, despite having common features such as the absence of dystrophin in the striated muscles, show different rates of disease progression, especially in terms of the age of loss of ambulation. This supports the idea that genetic modifiers exist and can influence both the phenotype and the clinical severity of the disease. To this end, Flanigan et al. identified SNPs located within the LTBP4 gene, which encodes latent transforming growth factor β (TGF‐β) binding protein 4 (LTBP4), in more than 200 patients, showing that individuals homozygous for the IAAM LTBP4 haplotype remained ambulatory significantly longer than those heterozygous or homozygous for the VTTT haplotype [24]. Furthermore, in long QT syndrome (LQTS), a rare hereditary cardiac disorder characterised by a prolongation of the QT interval due to mutations in genes encoding ion channels responsible for the generation of electrical impulses, it appears that the haplotype group C‐G‐T of the heat shock protein HSP‐70 gene is strongly related to the disease condition and may therefore represent a diagnostic biomarker [25].

An example of potential RNA biomarkers has been provided in a study by Harten et al. into Hutchinson‐Gilford progeria syndrome (HGPS, OMIM: 176670). This is a rare, fatal, autosomal dominant premature‐ageing disease (prevalence: <1/1,000,000) caused by splicing mutations in the LMNA gene that create cryptic splice sites and lead to the production of progerin, a toxic, permanently farnesylated splicing variant [26]. In their study, the authors analysed the expression profile of several matrix metalloproteinases, identifying a donor‐age‐dependent reduction in the expression of MMP‐3 mRNA in HGPS primary dermal fibroblast cultures, suggesting that a fall in MMP‐3 correlates with disease severity in vivo [26].

RNAseq can be used in conjunction with new technologies such as NGS to analyse the whole transcriptome both quantitatively and qualitatively and thereby provide information about alterations in gene expression. This approach can potentially speed up the process of genomic biomarker discovery and was used to good effect in a recent study aimed at tracing a detailed RNA profile in both collagen VI myopathy (ColVI) patients and an animal model of the same. Collagen VI myopathies are genetic disorders arising from mutations in the collagen VI genes; they range from the severe Ullrich congenital muscular dystrophy (UCMD, OMIM: 254090, prevalence: 1–9/1,000,000) to the milder Bethlem myopathy (BTHLM1, OMIM: 158810, prevalence: <1/1,000,000), both of which can be inherited in either a dominant or a recessive manner. Generally speaking, neither the type of mutation nor the effect of the mutation on the protein structure/function allows precise discrimination between the two phenotypes. However, by a combined RNAseq approach, the authors identified the potential involvement of circadian genes, reporting a marked deregulation of the CLOCK gene in UCMD patients alone, suggesting it as a candidate biomarker of disease severity in ColVI [27].

miRNAs also make quite appealing biomarkers, and a recent study by Eisenberg et al. found that the levels of muscle‐specific miRNAs (myomirs) are correlated with disease severity in several muscular dystrophies, including limb girdle and Duchenne/Becker muscular dystrophies [28]. miRNA studies have also been extended to other RDs, such as cystic fibrosis (CF, OMIM: 219700). This is a recessive genetic disorder (prevalence: 1–9/100,000) characterised by eccrine gland dysfunction, chronic obstructive lung disease, and exocrine pancreatic dysfunction. It is caused by mutations in the cystic fibrosis transmembrane conductance regulator gene (CFTR), and it appears that miR‐494 and miR‐145 are significantly over‐expressed in CF tissues with respect to those of healthy individuals, suggesting their role as disease biomarkers [29].

As mentioned above, genomic biomarkers also include epigenomic modifications such as DNA methylation. Recent studies on Friedreich ataxia (FRDA, OMIM 229300), the most common ataxia, which is caused by an expanded GAA repeat in the first intron of FXN, have demonstrated that hypermethylation of the gene region upstream of the expanded GAA repeat correlates with clinical severity, while hypomethylation of the downstream region correlates with the age at onset [30]. It is evident, therefore, that genomic biomarkers may have a wide spectrum of functions as clinical and research outcome measures.

#### **4.2. Proteomic biomarkers**

Proteomic studies have several advantages over genomic analysis, not least the potential identification of biomarkers more closely related to biological function/dysfunction. Furthermore, proteomic biomarkers are more readily accessible than genomic biomarkers, being detectable in body fluids such as blood and urine (**Figure 3**). This makes them potentially useful in clinical trials as early indicators of the disease condition, disease progression, or treatment effects (drug response or adverse effects).

As an example, Martell et al. have provided a clear indication of biomarker accessibility and utility in Morquio A syndrome, also named mucopolysaccharidosis IVA (MPS IVA, OMIM: 253000, prevalence: 1–9/1,000,000). This recessive lysosomal storage disorder is caused by a mutation in the N‐acetylgalactosamine‐6‐sulfatase gene (GALNS), which encodes an enzyme required for the degradation of keratan sulphate and chondroitin‐6‐sulphate. The mutation results in a wide spectrum of clinical features involving skeletal, cardiac, pulmonary, corneal, and hearing impairment, and the identification of biomarkers able to monitor the response to enzyme replacement therapy during clinical trials is long overdue. To this end, the authors measured the plasma levels of 88 candidate proteins, finding that three of them (alpha‐1‐antitrypsin, lipoprotein a, and serum amyloid P) may be suitable surrogate end‐points for clinical trials [31].

The main advantage of techniques that can assess biomarkers in body fluids is, of course, their lack of invasiveness. In this regard, a new protein technology, the SOMAscan assay, an aptamer‐based method able to recognise specific protein epitopes, has been used to evaluate protein levels in the sera of DMD patients. By using this technology to compare serum samples from two independent DMD cohorts with healthy individuals, 44 serum biomarkers were identified [32]. Similarly, Auray‐Blais et al. have recently applied novel MS‐based high‐throughput technologies to protein biomarker discovery in the urine samples of patients affected by Fabry disease, succeeding in identifying the lyso‐Gb3/related analogue profile as a diagnostic biomarker [33].

Low invasiveness is also a feature of the most commonly used method of measuring and validating protein biomarkers, the immunoassay. Immunoassays are based on the ability of monoclonal antibodies to capture and detect specific protein domains and enable the simultaneous investigation of several proteins using very low amounts of samples. For example, in idiopathic pulmonary fibrosis (IPF, OMIM 178500), a rare lethal lung disease (prevalence: 1–5/10,000) of unknown aetiology and variable and unpredictable course, a multiplexed assay has been used to simultaneously evaluate 92 proteins in plasma samples from more than 200 patients. By these means, three biomarkers predictive of IPF outcome were identified [34]. Other studies have used the ELISA immunoassay to evaluate serum levels of an extracellular matrix glycoprotein, tenascin‐C (TN‐C), in Emery‐Dreifuss muscular dystrophy (EMD, OMIM 310300), a rare neuromuscular disorder (1–9/1,000,000) characterised by muscular weakness and atrophy, with early joint contractures and cardiomyopathy, finding an association between elevated circulating TN‐C levels and an increased risk of developing dilated cardiomyopathy [35].

Due to the low invasiveness of the methods involved, proteomic biomarkers are also very appealing as surrogate end‐points in clinical trials and/or screening (e.g. neonatal testing).

#### **4.3. Other biomarkers**

As mentioned earlier, imaging technologies, and indeed any diagnostic test that is able to measure the disease status in patients, are useful for measuring, and therefore for investigating, certain biomarkers. Magnetic resonance imaging (MRI), for example, is a safe and non‐invasive method of analysing muscle, connective tissue, fat, and bone. Indeed, Kinali and co‐workers have demonstrated that the MRI scan, focused on particular muscles, can serve as a biomarker for disease progression in Duchenne muscular dystrophy (DMD, OMIM: 310200), a rare neuromuscular disease (affecting 1/3300 male births) characterised by rapidly progressive muscle weakness and wasting due to degeneration of skeletal, smooth, and cardiac muscles. MRI can be used to accurately identify which muscles are sufficiently preserved in DMD, making it a reliable tool for use in clinical trials. Similarly, MRI scans of muscle biopsies are currently being used to correlate the clinical features of muscle diseases with the structure and morphology of muscle fibres [36].

Neurophysiological measurements can also be exploited as imaging biomarkers. For instance, Vucic and colleagues have reported that transcranial magnetic stimulation (TMS) is a useful and non‐invasive method of assessing the functional integrity of the motor cortex and its corticomotoneuronal projections in ALS. Despite the similarities between these conditions, TMS was able to reliably distinguish ALS from similar peripheral disorders, thereby demonstrating its potential diagnostic utility [37].

In fact, imaging biomarkers are generally considered very appealing and have been the subject of intensive research in recent years. The ultimate aim of such research is the development of innovative methods of using imaging tools for the detection and monitoring of the signs and symptoms of RDs.

# **5. Applications and clinical translation of biomarkers**

### **5.1. Diagnostic/prognostic biomarkers**

A diagnostic, or prognostic, biomarker is one that identifies a disease or quantifies its pathogenic factors (**Figure 4**). Essentially, such biomarkers are signatures that divide the population into healthy and diseased individuals, but in some cases they can finely stratify the disease phenotype into different degrees of severity or sub‐phenotypes. The routine diagnostic markers classically used in clinical practice are temperature, blood pressure, and cholesterol levels, among others, whereas in genetic diseases, according to the IRDiRC statement [4], all gene mutations known to cause a Mendelian disease have to be considered its primary genetic biomarkers. For example, DMD, the most common fatal genetic disorder diagnosed during early childhood, arises through mutations in the causative dystrophin (*DMD*) gene, which are therefore considered disease biomarkers, and can accordingly be used to select patients for enrolment in clinical trials [38].

In some cases, mutations in causative genes can be considered biomarkers of disease severity. This is the case in fragile X syndrome (FXS, OMIM: 300624), a rare intellectual disability disorder with an estimated prevalence of about 1 in 2500 to 5000 men and 1 in 4000 to 6000 women. FXS is caused by an expanded CGG triplet repeat located within the 5' UTR of the FMR1 gene. The size of the triplet expansion defines four different phenotypes, ranging from healthy to severe, and can therefore be used to distinguish between them [39].
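As an illustration of how repeat size can act as a severity biomarker, the following Python sketch assigns a CGG repeat count to one of four categories. The cut‐offs used (up to 44, 45–54, 55–200, and more than 200 repeats) are the ranges commonly cited in diagnostic guidelines; they are an assumption of this example and are not taken from this chapter or from [39]. The function name and category labels are likewise hypothetical.

```python
# Upper bounds (inclusive) of commonly cited FMR1 CGG-repeat categories.
# ASSUMPTION: these cut-offs come from widely used diagnostic guidelines,
# not from the chapter itself; adjust them to the guideline in force.
FMR1_CATEGORIES = [
    (44, "normal"),
    (54, "intermediate"),
    (200, "premutation"),
]


def classify_cgg_repeats(repeat_count: int) -> str:
    """Map a CGG repeat count to one of four size categories."""
    if repeat_count < 0:
        raise ValueError("repeat_count must be non-negative")
    for upper_bound, label in FMR1_CATEGORIES:
        if repeat_count <= upper_bound:
            return label
    # Anything above the largest bound is treated as a full mutation.
    return "full mutation"


if __name__ == "__main__":
    for count in (30, 50, 100, 250):
        print(count, classify_cgg_repeats(count))
```

A table of upper bounds keeps the cut‐offs in one place, so they can be changed without touching the classification logic.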

In ALS (OMIM 105400), the situation is less clear cut. ALS is a devastating neurodegenerative disease with an incidence of 1/50,000 per year. Although several mutated genes have been identified in ALS (DCTN1, OMIM 601143; PRPH, OMIM 170710; SOD1, OMIM 147450; NEFH, OMIM 162230), the vast majority of patients do not show a defined genetic defect. This would seem to indicate that the causative gene has yet to be identified in these patients [40], and research in this area has therefore focused on the discovery of specific biomarkers able to assist clinical diagnosis and monitor disease progression. In this regard, Hwang et al. have correlated an increased serum level of HMGB1, a non‐histone architectural protein, with the onset of ALS, even in the early stages of the disease. This increased level of HMGB1 could also be useful as a severity biomarker, since they also found higher HMGB1 levels in patients with a severe disease status [41]. Moreover, the same group has recently correlated a reduction in the level of the protein encoded by the LG72 gene, an activator of D‐amino acid oxidase, with the pathogenesis of ALS [42].

**Figure 4.** Schematic representation of biomarker application. Biomarkers may identify, within a population, the individuals affected by a specific disease and then select the patients able to respond to treatment/intervention.

Another example of a diagnostic biomarker has been found in Alexander disease (ALXDRD, OMIM: 203450), a very rare neurodegenerative disorder (incidence of 1/2.7 million per year) characterised by varying degrees of macrocephaly, spasticity, ataxia, and seizures, which ultimately leads to psychomotor regression and death. Causative mutations have been identified in Glial Fibrillary Acidic Protein (GFAP), the major intermediate filament protein of astrocytes, and result in toxic accumulation of the protein. Animal model studies have demonstrated that transactivation of the GFAP promoter is an early indicator of the disease process, and that the GFAP level in the CSF could be a potential biomarker in human patients [43].

Biomarkers used in clinical practice to improve disease progression monitoring or disease‐risk prediction are defined as prognostic. Simply put, a prognostic biomarker provides information on the course of a disease in an untreated individual. An example has been identified for Marfan syndrome (MFS, OMIM: 154700), a systemic disease of the connective tissue characterised by a wide spectrum of cardiovascular, skeletal muscular, ophthalmic, and pulmonary manifestations. MFS has an estimated prevalence of around 1/5000, and affected patients are at increased risk of cardiovascular complications that lead to premature death. A correlation has been demonstrated between larger aortic root diameters, coupled with faster aortic root growth, and high serum levels of transforming growth factor‐β (TGF‐β). Increasing levels of TGF‐β predict cardiovascular events and thereby possess significant prognostic value [44].

Another biomarker for cardiac muscular involvement has been found in Fabry disease (FD, OMIM: 301500), a rare systemic disease (prevalence 1–5/10,000) characterised by the accumulation of globotriaosylceramide in the plasma and cellular lysosomes of vessels, nerves, tissues, and organs throughout the body. This accumulation leads to progressive skin lesions, renal failure, cardiac and cerebrovascular involvement, and peripheral neuropathy. Continuously elevated cardiac troponin I (cTNI), a laboratory parameter well known to reflect acute and chronic cardiac muscle damage, has been demonstrated in a substantial proportion of patients with FD, suggesting that raised cTNI levels could be a useful laboratory marker for assessing myocardial damage in FD [45].

Finally, a recent study on DMD has identified matrix metalloproteinase‐9 (MMP‐9) as both a diagnostic and a prognostic biomarker. Indeed, DMD patients showed higher serum levels of MMP‐9 and tissue inhibitor of metalloproteinase‐1 (TIMP‐1) than controls, with MMP‐9 levels being even higher in older, non‐ambulant patients than in ambulant patients [46].

### **5.2. Predictive/therapeutic biomarkers**

Considering the heterogeneous nature of RDs, not all patients are expected to benefit from a newly available treatment. Hence the identification of a sub‐group of patients likely to respond to a novel treatment is important both in terms of health, and in terms of cost‐effectiveness [12]. To this end, a predictive, or therapeutic, marker must be able to discriminate between drug responders (patients gaining benefit from the therapy) and poor/low responders (**Figure 4**). Predictive biomarkers will therefore enable the most appropriate and efficacious treatments or interventions to be selected for each patient, thereby underpinning a personalised approach to treatment.
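The discriminating ability described above can be quantified with sensitivity and specificity once a biomarker cut‐off has been chosen. The Python sketch below is a minimal illustration under stated assumptions: the rule that a level at or above the cut‐off predicts response, the function name, and all numerical values are hypothetical and do not come from any study cited in this chapter.

```python
from typing import Sequence, Tuple


def evaluate_cutoff(
    levels: Sequence[float],
    responded: Sequence[bool],
    cutoff: float,
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for the rule 'level >= cutoff
    predicts a responder'. ASSUMPTION: higher levels indicate response."""
    tp = fn = tn = fp = 0
    for level, is_responder in zip(levels, responded):
        predicted_responder = level >= cutoff
        if is_responder and predicted_responder:
            tp += 1
        elif is_responder:
            fn += 1
        elif predicted_responder:
            fp += 1
        else:
            tn += 1
    # Guard against empty groups rather than dividing by zero.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


if __name__ == "__main__":
    # Hypothetical baseline biomarker levels and observed treatment response.
    levels = [12.0, 15.5, 9.0, 20.1, 7.5, 8.2, 14.0, 6.9]
    responded = [True, True, False, True, False, False, True, True]
    print(evaluate_cutoff(levels, responded, cutoff=10.0))
```

In practice the cut‐off itself would be chosen and validated on independent patient cohorts; the sketch only shows how a candidate cut‐off can be scored.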

There are a few examples of therapeutic biomarkers useful in RDs, generally SNPs, as in typical pharmacogenetics, although some protein studies have also been reported. For instance, a pharmacological predictive biomarker has been reported in idiopathic nephrotic syndrome, an RD affecting the kidneys. Specifically, Wen et al. [47] found a significant difference in the serum proteome of steroid‐sensitive nephrotic syndrome (SSNS) and steroid‐resistant nephrotic syndrome (SRNS, OMIM 256370) patients, predictive of their respective responses to treatment.

Another example of a predictive biomarker has been found for Gaucher disease (GD, OMIM: 230800), a rare recessive genetic disorder (approximate prevalence 1/100,000) caused by mutations in the GBA gene, which codes for a lysosomal enzyme, glucocerebrosidase. Although the clinical manifestations of this disease are extremely variable, ranging from non‐neurological manifestations such as organomegaly, bone anomalies, and cytopenia to acute neurological forms, a recent time‐course analysis of ferritin, chitotriosidase, haemoglobin, and platelets showed that the levels of these biomarkers undergo variation during the course of enzyme replacement therapy [48].

Sometimes the same biomarker can be useful in multiple scenarios. For example, TGF‐β, in addition to serving a prognostic function in Marfan syndrome, could feasibly be used as a therapeutic biomarker in the same condition. Indeed, in a recent study, patients who responded to losartan, used to reduce the aortic root dilatation rate, had higher baseline TGF‐β levels but exhibited lower plasma TGF‐β concentrations during losartan therapy [49].

Predictive biomarkers such as these are likely to play an increasingly important role in clinical practice, since evaluating the efficacy of a treatment/intervention is fundamental to making decisions about treatment choices, and therefore determining therapy outcomes.

# **6. Conclusions**

Since the definition of biomarkers in 2001, their importance in clinical and research settings has increased dramatically due to their diagnostic/prognostic functions and their ability to monitor/predict disease stage, treatment response, and/or adverse effects. Indeed, the creation of an exhaustive catalogue of approved biomarkers may be the single most important innovation in healthcare, bringing considerable clinical and economic benefits. Although current research, both academic and corporate, is heavily focused on the development of drugs and companion diagnostic tests, in the future, biomarker discovery and development will be vital for tailoring medical care to individual patients. This will be especially important in the field of RDs, in which the discovery of efficacious biomarkers is likely to greatly facilitate the process of EMA approval and development of novel orphan drugs. In addition to being both time‐ and cost‐effective, biomarker research also provides exciting opportunities to expand our knowledge of the physiopathological mechanisms behind rare and other diseases, helping to discriminate between distinct disease presentations and comorbidities, and to predict the different impacts of concomitant medication and of important demographic parameters such as gender, age, and ethnicity. In short, biomarker discovery represents a giant leap towards the ultimate goal of truly personalised medicine.

## **Acknowledgements**

The BIO‐NMD (http://www.bio-nmd.eu/) (European Union FP7 #241665) and Neuromics (European Union FP7 #305121) projects are acknowledged.

# **Author details**

Chiara Scotton and Alessandra Ferlini\*

\*Address all correspondence to: fla@unife.it

Medical Genetics Unit, Department of Medical Sciences, University of Ferrara, Ferrara, Italy

## **References**


relation to clinical development and patient selection. (Draft) EMA/446337/2011. Available from: http://www.ema.europa.eu/docs/en\_GB/document\_library/Scientific\_guideline/2011/07/WC500108672.pdf [Accessed: 2016‐02‐28].


Hutchinson‐Gilford progeria syndrome. J Gerontol A Biol Sci Med Sci. 2011;66:1201–1207. DOI: 10.1093/gerona/glr137


in the course of Emery‐Dreifuss muscular dystrophy. Clin Chim Acta. 2011;412:1533–1538. DOI: 10.1016/j.cca.2011.04.033

